Study of Indexing Techniques to Improve the Performance of Information Retrieval in Telugu Language
Author
Abstract
Information retrieval systems (IRS) have become popular through the World Wide Web. Text information describing all types of objects on the web, such as documents, web pages, images, videos and audio files, is growing exponentially day by day. When a text repository grows to the limit of the server's memory, finding a particular text unit, whether a word or a document, becomes a tedious task. Representing these objects with text information gives summarized features that let a user decide at first glance whether to access the identified unit. Instead of matching the exact query against the document, a set of keywords is used to judge the document's relevance. If a set of keywords represents a document, it is easy to match the keywords of the query against the keywords of the document and decide relevance. Finding keywords to represent a complete unit is called indexing. Keywords are designated units that make a document easy to locate with any search engine. A keyword maps to all the documents that contain this indexed word. The retrieval problem is therefore addressed by identifying the index words or phrases of a document. Together, the indexing terms represent the whole document and act as ambassadors of the unit. In this paper we study the effect of various indexing techniques, namely manual, automatic and semi-automatic, on 10,000 Telugu text documents. Statistical indexing is taken as the baseline approach, and its results are compared with those of the other techniques. We observe that the results improve when moving from statistical representations to semantic representations.
Keywords: Keywords, Indexing Terms, Manual Indexing, Automatic Indexing, Statistical Indexing, Semantic based Indexing, Telugu Text Corpus, N-gram, Inverted File Structure.
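To make the idea of a keyword mapping to all documents that contain it more concrete, here is a minimal sketch of a frequency-based index stored as an inverted file. It is only an illustration under stated assumptions, not the method evaluated in the paper: the function names (`tokenize`, `top_keywords`, `build_inverted_index`, `search`), the whitespace tokenizer, the cutoff `k` and the two sample documents are all hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    # Assumption: plain whitespace tokenization. A real Telugu pipeline
    # would likely need normalization and stemming, which are not shown.
    return text.split()


def top_keywords(text, k=3):
    """Pick the k most frequent tokens as index terms (statistical baseline)."""
    counts = Counter(tokenize(text))
    # Sort by descending frequency, then by code point for deterministic ties.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:k]]


def build_inverted_index(docs, k=3):
    """Map each index term to the sorted list of document ids it represents."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in top_keywords(text, k):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents whose index terms match any query keyword."""
    hits = set()
    for term in tokenize(query):
        hits.update(index.get(term, []))
    return sorted(hits)


# Hypothetical two-document corpus, used only for illustration.
docs = {
    1: "తెలుగు భాష తెలుగు సాహిత్యం",
    2: "సమాచార శోధన తెలుగు సమాచార",
}
index = build_inverted_index(docs, k=2)
print(search(index, "తెలుగు"))
```

Because only the top `k` terms of each document are indexed, a query word that occurs in a document but is not among its index terms returns no hit. That is the trade-off of representing a document by a few keywords instead of its full text.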
Similar resources
The State of Information Retrieval in the Namayeh and Nama Databases and an Assessment of the Effectiveness of Using Controlled Vocabulary in Indexing These Two Databases
Purpose: This study was carried out to determine the level of precision, recall, and searching time for “Nama” and “Namayeh” databases, as well as to find out which of the indexing tools (thesaurus and Dewey decimal classification) helps us more in improvement of information retrieval. Methodology: This study is an analytical survey in which the necessary data was collected by direct observati...
An Approach for Improving Execution Performance in Inference Network Based Information Retrieval
The inference network retrieval model provides the ability to combine a variety of retrieval strategies expressed in a rich query language. While this power yields impressive retrieval effectiveness, it also presents barriers to the incorporation of traditional optimization techniques intended to improve the execution efficiency, or speed, of retrieval. The essence of these optimization techniq...
Bitmap Indexing-based Clustering and Retrieval of XML Documents
This paper describes a bitmap indexing based technique to cluster XML documents. XML documents can be hierarchically represented by elements. To improve performance of information retrieval, documents can be indexed using bitmap techniques. Such a bitmap index is sparse, meaning it contains unnecessarily many zero bits, especially for the word dimension. To remove zero bits and improve the perf...
Content Based Radiographic Images Indexing and Retrieval Using Pattern Orientation Histogram
Introduction: Content Based Image Retrieval (CBIR) is a method of image searching and retrieval in a database. In medical applications, CBIR is a tool used by physicians to compare the previous and current medical images associated with patients' pathological conditions. As the volume of pictorial information stored in medical image databases is growing, efficient image indexing and retri...
A Study of the Impact of Concept-Based Image Indexing on Image Retrieval Using the Google Search Engine
Purpose: The purpose of the present study is to investigate the Impact of Concept-based Image Indexing on Image Retrieval via Google. Due to the importance of images, this article focuses on the features taken into account by Google in retrieving the images. Methodology: The present study is a type of applied research, and the research method used in it comes from quasi-experimental and techno...
Journal title:
Volume, issue:
Pages: -
Publication date: 2013